Smart Paper Notes

BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus

Josh Meyer , David Ifeoluwa Adelani , Edresson Casanova , Alp Öktem , Daniel Whitenack , Julian Weber , Salomon Kabongo , Elizabeth Salesky , Iroro Orife , Colin Leong , Perez Ogayo

Category: Natural Language Processing

2022-07-07

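BibleTTS is a large, high-quality, open speech dataset for ten languages spoken in Sub-Saharan Africa. The corpus contains up to 86 hours of aligned, studio-quality, 48kHz single-speaker recordings per language, enabling the development of high-quality text-to-speech models. The ten languages represented are: Akuapem Twi, Asante Twi, Chichewa, Ewe, Hausa, Kikuyu, Lingala, Luganda, Luo, and Yoruba. The corpus is a derivative work of Bible recordings made and released by the Open.Bible project from Biblica. We have aligned, cleaned, and filtered the original recordings, and additionally hand-checked a subset of the alignments for each language. We present results for text-to-speech models with Coqui TTS. The data is released under a commercial-friendly CC-BY-SA license.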

translated by Google Translate

Building Segmentation on Satellite Images and Performance of Post-Processing Methods

Metehan Yalçın , Ahmet Alp Kindiroglu , Furkan Burak Bağcı , Ufuk Uyan , Mahiye Uluyağmur Öztürk

Category: Computer Vision

2022-12-28

With advances in computer vision algorithms and easier access to satellite imagery, researchers are working intensively on satellite images and the information they contain. Building segmentation of satellite images can support many applications, such as city, agricultural, and communication network planning. However, since labeled datasets do not exist for every region, a model trained in one region must generalize to others. In this study, we trained several models on data from China and applied post-processing to the best of them. The models are evaluated on the Chicago region of the INRIA dataset. Although state-of-the-art results have not been achieved, the results are promising. This study presents our initial experimental results on building segmentation from satellite images.

Semi-Supervised Domain Adaptation for Semantic Segmentation of Roads from Satellite Images

Ahmet Alp Kindiroglu , Metehan Yalçın , Furkan Burak Bağcı , Mahiye Uluyağmur Öztürk

Category: Computer Vision

2022-12-26

This paper presents the preliminary findings of a semi-supervised segmentation method for extracting roads from satellite images. Artificial neural networks and image segmentation methods are among the most successful methods for extracting road data from satellite images. However, these models require large amounts of training data from different regions to achieve high accuracy. When such data is insufficient in quantity or quality, a standard approach is to train deep neural networks by transferring knowledge from annotated data obtained from other sources. This study proposes a method that performs road segmentation with semi-supervised learning. A semi-supervised domain adaptation method based on pseudo-labeling and the Minimum Class Confusion method is proposed, and it is observed to increase performance on the target datasets.
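
To make the pseudo-labeling idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a model already trained on labeled source data outputs per-sample class probabilities for unlabeled target data; the function name `pseudo_labels` and the 0.9 confidence threshold are hypothetical choices for illustration.

```python
from typing import List, Optional, Sequence


def pseudo_labels(
    probs: Sequence[Sequence[float]], threshold: float = 0.9
) -> List[Optional[int]]:
    """Assign each sample its most likely class, or None if confidence is too low."""
    labels: List[Optional[int]] = []
    for p in probs:
        # Index of the class with the highest predicted probability.
        best = max(range(len(p)), key=lambda k: p[k])
        # Keep the prediction as a training label only when it is confident.
        labels.append(best if p[best] >= threshold else None)
    return labels


# Hypothetical (road, background) probabilities for three target-domain pixels.
example = [[0.97, 0.03], [0.55, 0.45], [0.05, 0.95]]
print(pseudo_labels(example))  # [0, None, 1]
```

Samples labeled `None` would simply be left out of the next training round; the threshold trades label coverage against label noise.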

Building Height Prediction with Instance Segmentation

Furkan Burak Bagci , Ahmet Alp Kindiroglu , Metehan Yalcin , Ufuk Uyan , Mahiye Uluyagmur Ozturk

Category: Computer Vision

2022-12-19

Extracting building heights from satellite images is an active research area with applications in fields such as telecommunications and city planning. Many studies use DSMs (Digital Surface Models) generated from lidar or stereo images for this purpose. Predicting building heights from RGB images alone is challenging because of insufficient data, low data quality, variation in building types, and differing angles of light and shadow. In this study, we present an instance segmentation-based building height extraction method that predicts building masks with their respective heights from a single RGB satellite image. We used satellite images with building height annotations for certain cities, along with an open-source satellite dataset, in a transfer learning approach. On our test set, we reached a bounding box mAP of 59, a mask mAP of 52.6, and an average accuracy of 70% for buildings belonging to each height class.

Real Time Incremental Image Mosaicking Without Use of Any Camera Parameter

Suleyman Melih Portakal , Ahmet Alp Kindiroglu , Mahiye Uluyagmur Ozturk

Category: Computer Vision

2022-12-05

Over the past decade, there has been a significant increase in the use of Unmanned Aerial Vehicles (UAVs) to support a wide variety of missions, such as remote surveillance, vehicle tracking, and object detection. For problems that involve processing areas larger than a single image, mosaicking of UAV imagery is a necessary step. Real-time image mosaicking is used for missions that require a fast response, such as search and rescue. It typically requires information from additional sensors, such as the Global Positioning System (GPS) and an Inertial Measurement Unit (IMU), to facilitate direct orientation, or 3D reconstruction approaches to recover the camera poses. This paper proposes a UAV-based system for real-time creation of incremental mosaics that requires neither direct nor indirect camera parameters, such as orientation information. As in previous approaches, the mosaicking process relies on four steps to achieve a high-quality result: extracting features from images, matching similar key points between images, finding a homography matrix to warp and align images, and blending images to obtain better-looking mosaics. As a novel approach, edge detection is used in the blending step. Experimental results show that real-time incremental image mosaicking can be completed satisfactorily without any additional camera parameters.

Minimum Class Confusion based Transfer for Land Cover Segmentation in Rural and Urban Regions

Metehan Yalçın , Ahmet Alp Kındıroğlu , Furkan Burak Bağcı , Ufuk Uyan , Mahiye Uluyağmur Öztürk

Category: Computer Vision

2022-12-05

Transfer learning methods are widely used in satellite image segmentation and improve on the performance of classical supervised learning methods. In this study, we present a semantic segmentation method that produces land cover maps by using transfer learning. We compare models trained on low-resolution images with insufficient data for the targeted region or zoom level. To boost performance on target data, we experiment with models trained with unsupervised, semi-supervised, and supervised transfer learning approaches, using satellite images from public datasets and other unlabeled sources. According to the experimental results, transfer learning improves segmentation performance by 3.4% MIoU (Mean Intersection over Union) in rural regions and by 12.9% MIoU in urban regions. We observed that transfer learning is more effective when the two datasets share a comparable zoom level and are labeled with identical rules; otherwise, semi-supervised learning that treats the data as unlabeled is more effective. In addition, experiments showed that HRNet outperformed building segmentation approaches in multi-class segmentation.
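
For readers unfamiliar with the metric, the following is a minimal sketch of how MIoU can be computed from flattened label maps. It is illustrative only: the function name `mean_iou` and the convention of skipping classes absent from both prediction and ground truth are assumptions, and the abstract does not say which averaging convention the authors used.

```python
from typing import Sequence


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Average per-class intersection over union for flattened label maps."""
    ious = []
    for c in range(num_classes):
        # Pixels where both ground truth and prediction are class c.
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Pixels where either ground truth or prediction is class c.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:  # assumed convention: skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy six-pixel example with three land cover classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(truth, pred, num_classes=3), 3))  # 0.5
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5.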

Variational Bayes for robust radar single object tracking

Alp Sarı , Tak Kaneko , Lense H. M. Swaenen , Wouter M. Kouw

Category: Computer Vision | Machine Learning

2022-09-28

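We address object tracking by radar and the robustness of current state-of-the-art methods to process outliers. Standard tracking algorithms extract detections from radar image space for use in the filtering stage. Filtering is performed by a Kalman filter, which assumes Gaussian-distributed noise. However, this assumption does not account for large modeling errors and leads to poor tracking performance during abrupt motions. We take the Gaussian sum filter (the single-object variant of the multi-hypothesis tracker) as our baseline and propose a modification that models the process noise with a distribution that has heavier tails than a Gaussian. Variational Bayes provides a fast, computationally cheap inference algorithm. Our simulations show that, in the presence of process outliers, the robust tracker outperforms the Gaussian sum filter when tracking single objects.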
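
The lag that motivates heavier-tailed process noise can be seen in a minimal scalar Kalman filter. This sketch covers only the standard Gaussian filter, not the authors' variational Bayes tracker or the Gaussian sum filter baseline; the random-walk motion model, the function name `kalman_1d`, and all parameter values are assumptions for illustration.

```python
from typing import List, Sequence


def kalman_1d(
    measurements: Sequence[float],
    q: float,          # process noise variance (assumed Gaussian)
    r: float,          # measurement noise variance (assumed Gaussian)
    x0: float = 0.0,   # initial state estimate
    p0: float = 1.0,   # initial estimate variance
) -> List[float]:
    """Scalar Kalman filter with a random-walk state model."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is carried over and its variance grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# An abrupt jump from 0 to 10: with a small q the estimate catches up slowly.
track = kalman_1d([0.0, 0.0, 0.0, 10.0, 10.0, 10.0], q=0.01, r=1.0)
print([round(e, 2) for e in track])
```

With a small `q` the filter treats the jump as unlikely and the estimate trails the measurements for several steps, which is the behaviour that a heavier-tailed process noise model is meant to reduce.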

Calibrating Ensembles for Scalable Uncertainty Quantification in Deep Learning-based Medical Segmentation

Thomas Buddenkotte , Lorena Escudero Sanchez , Mireia Crispin-Ortuzar , Ramona Woitek , Cathal McCague , James D. Brenton , Ozan Öktem , Evis Sala , Leonardo Rundo

Category: Machine Learning | Computer Vision

2022-09-20

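Uncertainty quantification in automated image analysis is highly desired in many applications. Typically, machine learning models for classification or segmentation are developed only to provide binary answers. However, quantifying the uncertainty of a model can play a critical role in active learning or human-machine interaction. Uncertainty quantification is especially difficult with deep learning-based models, which are the state of the art in many imaging applications. Current uncertainty quantification approaches do not scale well to high-dimensional real-world problems. Scalable solutions often rely on classical techniques applied during inference, or on training ensembles of identical models with different random seeds, to obtain a posterior distribution. In this paper, we show that these approaches fail to approximate the classification probability. Instead, we propose a scalable and intuitive framework for calibrating ensembles of deep learning models to produce uncertainty quantification measurements that approximate the classification probability. On unseen test data, we demonstrate improved calibration, sensitivity (in two out of three cases), and precision compared with the standard approaches. We further motivate the use of our method in active learning, in creating pseudo-labels to learn from unlabeled images, and in human-machine collaboration.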

Deep Learning for Material Decomposition in Photon-Counting CT

Alma Eguizabal , Ozan Öktem , Mats U. Persson

Category: Machine Learning

2022-08-05

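Photon-counting CT (PCCT) offers improved diagnostic performance through better spatial and energy resolution, but developing high-quality image reconstruction methods that can handle these large datasets is challenging. Model-based solutions incorporate models of the physical acquisition to reconstruct more accurate images, but they depend on an accurate forward operator and have difficulty finding good regularization. Another approach is deep learning-based reconstruction, which has shown great promise in CT. However, fully data-driven solutions typically need large amounts of training data and lack interpretability. To combine the benefits of both methods while minimizing their respective drawbacks, it is desirable to develop reconstruction algorithms that combine model-based and data-driven approaches. In this work, we present a novel deep learning solution for material decomposition in PCCT, based on an unrolled/unfolded iterative network. We evaluate two cases: a learned post-processing, which implicitly uses model knowledge, and a learned gradient descent, which has explicit model-based components in the architecture. With our proposed techniques, we solve a challenging PCCT simulation case: three-material decomposition in abdominal imaging with low dose, iodine contrast, and very small training sample support. In this scenario, our approach outperforms maximum likelihood estimation, a variational method, and a fully learned network.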

FedHeN: Federated Learning in Heterogeneous Networks

Durmus Alp Emre Acar , Venkatesh Saligrama

Category: Machine Learning

2022-07-07

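We propose a novel training recipe for federated learning with heterogeneous networks, where each device can have a different architecture. We introduce training with a side objective on the devices of higher complexity to jointly train the different architectures in a federated setting. We show empirically that our approach improves the performance of the different architectures and leads to high communication savings compared with state-of-the-art methods.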
